Exploring numerical variables

Mapping

Source: Plotly

In [1]:
%matplotlib inline
import pandas as pd
import numpy as np
import seaborn as sns 
import matplotlib.pyplot as plt

Import data

Here we load a GeoJSON file containing the geometry information for US counties, where feature.id is a FIPS code.

In [2]:
from urllib.request import urlopen
import json
with urlopen('https://raw.githubusercontent.com/kirenz/modern-statistics/main/data/geojson-counties-fips.json') as response:
    counties = json.load(response)

counties["features"][0]
Out[2]:
{'type': 'Feature',
 'properties': {'GEO_ID': '0500000US01001',
  'STATE': '01',
  'COUNTY': '001',
  'NAME': 'Autauga',
  'LSAD': 'County',
  'CENSUSAREA': 594.436},
 'geometry': {'type': 'Polygon',
  'coordinates': [[[-86.496774, 32.344437],
    [-86.717897, 32.402814],
    [-86.814912, 32.340803],
    [-86.890581, 32.502974],
    [-86.917595, 32.664169],
    [-86.71339, 32.661732],
    [-86.714219, 32.705694],
    [-86.413116, 32.707386],
    [-86.411172, 32.409937],
    [-86.496774, 32.344437]]]},
 'id': '01001'}
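
In the feature shown above, the id appears to be the two-digit STATE code followed by the three-digit COUNTY code, which is how county FIPS codes are built. The following is a minimal sketch of that relationship on a trimmed copy of the feature (geometry omitted); the helper name fips_from_properties is hypothetical and not part of the notebook.

```python
# Trimmed copy of the first feature printed above (geometry left out for brevity).
feature = {
    "type": "Feature",
    "properties": {
        "GEO_ID": "0500000US01001",
        "STATE": "01",
        "COUNTY": "001",
        "NAME": "Autauga",
    },
    "id": "01001",
}


def fips_from_properties(props):
    """Hypothetical helper: join the state and county codes into a 5-character FIPS string."""
    return props["STATE"] + props["COUNTY"]


fips = fips_from_properties(feature["properties"])
# Compare the rebuilt code with the feature's own id.
print(fips, fips == feature["id"])
```

The same check could be run over every entry in counties["features"] to confirm that the ids are consistent before joining other data to them.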

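Here we load unemployment data by county, also indexed by FIPS code. Reading the fips column as a string keeps the leading zeros, so the codes match the feature ids in the GeoJSON file.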

In [4]:
df = pd.read_csv("https://raw.githubusercontent.com/kirenz/modern-statistics/main/data/fips-unemp-16.csv",
                 dtype={"fips": str})
df.head()
Out[4]:
fips unemp
0 01001 5.3
1 01003 5.4
2 01005 8.6
3 01007 6.6
4 01009 5.5
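
If FIPS codes were ever parsed as integers, a code such as 01001 would become 1001 and would no longer match the GeoJSON ids. As a rough sketch of one way to repair that, assuming five-character county codes, the values can be zero-padded back to strings; normalize_fips is a hypothetical helper, not something the notebook defines.

```python
def normalize_fips(code):
    """Hypothetical helper: return a 5-character county FIPS string with leading zeros restored."""
    return str(code).strip().zfill(5)


# An integer that lost its leading zero is padded back to five characters.
print(normalize_fips(1001))
# A code that is already a 5-character string is left unchanged.
print(normalize_fips("01003"))
```

With a DataFrame, the same idea could be applied column-wise, for example with df["fips"].astype(str).str.zfill(5).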

Next, we use Plotly Express to draw a choropleth map of the unemployment rate by county, matching each fips value to a feature id in the GeoJSON data:

In [5]:
import plotly.express as px

# Color each county by its unemployment rate; 'fips' is matched against the GeoJSON feature ids
fig = px.choropleth(df, geojson=counties, locations='fips', color='unemp',
                    color_continuous_scale="Viridis",
                    range_color=(0, 12),
                    scope="usa",
                    labels={'unemp': 'unemployment rate'})
fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig.show()
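
Because the map links rows to shapes through the FIPS code, rows whose code has no matching feature id would generally not be drawn. A small pre-plot check can list such codes. The sketch below uses made-up sample data, and unmatched_fips is a hypothetical helper.

```python
def unmatched_fips(locations, geojson):
    """Hypothetical helper: return the location codes that have no feature with the same id."""
    feature_ids = {feature["id"] for feature in geojson["features"]}
    return sorted(set(locations) - feature_ids)


# Made-up sample: two features, and one location that lost its leading zero.
sample_geojson = {
    "type": "FeatureCollection",
    "features": [{"id": "01001"}, {"id": "01003"}],
}
sample_locations = ["01001", "01003", "1005"]

missing = unmatched_fips(sample_locations, sample_geojson)
print(missing)
```

With the objects from this notebook, the call would be unmatched_fips(df["fips"], counties); an empty list would suggest that every row can be placed on the map.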